Inflection 85: Add support for Malayalam #138
grhoten
left a comment
This looks like good progress. I've added some comments to help guide you to fix some test failures.
എന്റെ,first,singular,genitive
എന്റെത്,first,singular,genitive
In this example, it seems that എന്റെ means "my", and എന്റെത് means "mine". If that is the case, then you want to mirror what pronoun_en.csv says. You would want to use the following:
എന്റെ,first,singular,genitive,dependent
എന്റെത്,first,singular,genitive,independent
You would need to also add the following to grammar.xml.
<category name="determination">
<restrictions>
<restriction name="pos" value="pronoun"/>
<restriction name="case" value="genitive"/>
</restrictions>
<grammeme name="independent"/> <!-- e.g. mine -->
<grammeme name="dependent"/> <!-- e.g. my {object} -->
</category>
Do not use dependency= because that's for getting the grammemes (the grammatical category values) of the object being possessed, which is not relevant here.
static dialog::DictionaryLookupInflector dictionaryInflector(
util::LocaleUtils::MALAYALAM(),
{
{CASE_NOMINATIVE, CASE_ACCUSATIVE, CASE_DATIVE, CASE_GENITIVE, CASE_LOCATIVE, CASE_INSTRUMENTAL, CASE_SOCIATIVE},
{NUMBER_SINGULAR, NUMBER_PLURAL},
{GENDER_MASCULINE, GENDER_FEMININE, GENDER_NEUTER},
{FORMALITY_FORMAL, FORMALITY_INFORMAL},
{CLUSIVITY_INCLUSIVE, CLUSIVITY_EXCLUSIVE},
{PERSON_FIRST, PERSON_SECOND, PERSON_THIRD},
{TENSE_PAST, TENSE_PRESENT, TENSE_FUTURE},
{MOOD_INDICATIVE, MOOD_IMPERATIVE, MOOD_SUBJUNCTIVE}
},
{},
true
);
This should be a part of the class and not statically declared. It's preferable to minimize the statically allocated objects like this in case we ever want to minimize the memory being used when switching languages in the library.
Also you're missing the part of speech ordering. You should consider adding this as the first line to prioritize nouns first.
{GrammemeConstants::POS_NOUN(), GrammemeConstants::POS_ADJECTIVE(), GrammemeConstants::POS_VERB()},
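A rough sketch of both suggestions: the inflector is held as a class member rather than a file-level static, and the part-of-speech ordering is listed first. Everything in the `sketch` namespace (`LookupInflector`, `MlSynthesizerSketch`, the plain string grammemes) is a hypothetical stand-in written for illustration; the real `DictionaryLookupInflector` constructor and the `GrammemeConstants` accessors are not reproduced here.

```cpp
#include <string>
#include <utility>
#include <vector>

namespace sketch {

using GrammemeList = std::vector<std::u16string>;

// Hypothetical stand-in for an inflector that receives prioritized grammeme lists.
class LookupInflector {
public:
    explicit LookupInflector(std::vector<GrammemeList> priorities)
        : priorities_(std::move(priorities)) {}
    const std::vector<GrammemeList>& priorities() const { return priorities_; }

private:
    std::vector<GrammemeList> priorities_;
};

// Hypothetical owner class: the inflector is a member, so its memory is
// released together with the object instead of living for the whole process.
class MlSynthesizerSketch {
public:
    MlSynthesizerSketch()
        : inflector_(std::vector<GrammemeList>{
              {u"noun", u"adjective", u"verb"},  // part of speech first, nouns prioritized
              {u"nominative", u"accusative", u"dative", u"genitive"},
              {u"singular", u"plural"},
          }) {}
    const LookupInflector& inflector() const { return inflector_; }

private:
    LookupInflector inflector_;
};

} // namespace sketch
```

With this layout, destroying the owning object also releases the inflector, which is the memory concern raised in the comment.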
if (!inflectedOpt.has_value() && enableInflectionGuess) {
const bool isNoun = (posFeatureValue == u"noun");
const bool isPluralTarget = (numberFeatureValue == NUMBER_PLURAL);

if (isNoun && isPluralTarget) {
const std::u16string token = displayValue->getDisplayString();

std::u16string guessed = token;
if (!token.empty() && token.back() != u'്') {
guessed += u"കൾ";
} else {
guessed += u"ങ്ങൾ";
}

if (guessed != token) {
return new DisplayValue(guessed, constraints);
}
}
}

return nullptr;
This indentation looks off. Can you please fix that?
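For reference, the suffix-guessing branch quoted above could be laid out as in the sketch below, with one indentation level per block. `guessPlural` is a hypothetical free function in a made-up `sketch` namespace; it returns the guessed string directly instead of a `DisplayValue`, and it mirrors the quoted rule as-is without checking it linguistically.

```cpp
#include <string>

namespace sketch {

// Hypothetical helper mirroring the quoted branch: append one of two plural
// suffixes depending on whether the token ends in a virama (U+0D4D).
std::u16string guessPlural(const std::u16string& token) {
    constexpr char16_t VIRAMA = u'\u0D4D';  // ്
    std::u16string guessed = token;
    if (!token.empty() && token.back() != VIRAMA) {
        guessed += u"\u0D15\u0D7E";  // കൾ
    } else {
        guessed += u"\u0D19\u0D4D\u0D19\u0D7E";  // ങ്ങൾ
    }
    return guessed;
}

} // namespace sketch
```

As in the quoted code, an empty token falls into the `else` branch.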
<!-- Genitive forms without dependency -->
<test><source case="genitive" person="first" number="singular">ഞാൻ</source><result>എന്റെ</result></test>
<test><source clusivity="exclusive" case="genitive" person="first" number="plural">ഞങ്ങൾ</source><result>ഞങ്ങളുടെ</result></test>
<test><source clusivity="inclusive" case="genitive" person="first" number="plural">നാം</source><result>നമ്മുടെ</result></test>
<test><source formality="formal" case="genitive" person="second" number="singular">താങ്കൾ</source><result>താങ്കളുടെ</result></test>

<!-- Indicative (default) -->
<test><source tense="present" mood="indicative" person="third" number="singular">വരിക</source><result>വരുന്നു</result></test>

<!-- Imperative -->
<test><source mood="imperative" person="second" number="singular" formality="informal">വരിക</source><result>വരു</result></test>
<test><source mood="imperative" person="second" number="singular" formality="formal">വരിക</source><result>വരുക</result></test>

<!-- Subjunctive -->
<test><source mood="subjunctive" person="third" number="singular">വരിക</source><result>വരുമെന്ന്</result></test>
<test><source mood="subjunctive" person="first" number="singular">ചോദിക്കുക</source><result>ചോദിക്കാമെന്ന്</result></test>
It's uncommon to be this precise. Ideally you will want to specify the minimal grammemes. For example, you can pick an initial verb that is already subjunctive, third person, and singular, and then you only inflect it to plural without the other constraints. The DictionaryLookupInflector will typically take care of these other details.
// Add case, number always
addIfNotEmpty(caseFeature);
addIfNotEmpty(numberFeature);

// Add gender only if pos is pronoun
if (posFeatureValue == u"pronoun") {
addIfNotEmpty(genderFeature);
}
This is very common to use. Though I'm a little surprised about the gender being supported by only pronouns. Hopefully that is intentional. In most languages, the gender doesn't apply when the plural form is used. I definitely recommend reusing the string constants available in GrammemeConstants. So you may want to replace u"pronoun". It makes it much easier when maintaining this code to see which implementations are using this grammeme.
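A minimal sketch of the constant reuse suggested here. `sketch::Constants` is a hypothetical stand-in for `GrammemeConstants` (only the accessor-function style is taken from this pull request), and `collectConstraints` is an invented helper that mimics the quoted `addIfNotEmpty` calls.

```cpp
#include <string>
#include <vector>

namespace sketch {

// Hypothetical stand-in for the library's GrammemeConstants: one shared
// definition of the string instead of a u"pronoun" literal at each call site.
struct Constants {
    static const std::u16string& POS_PRONOUN() {
        static const std::u16string value(u"pronoun");
        return value;
    }
};

// Collects case and number when present, and gender only for pronouns.
std::vector<std::u16string> collectConstraints(const std::u16string& pos,
                                               const std::u16string& caseValue,
                                               const std::u16string& numberValue,
                                               const std::u16string& genderValue) {
    std::vector<std::u16string> result;
    auto addIfNotEmpty = [&result](const std::u16string& value) {
        if (!value.empty()) {
            result.push_back(value);
        }
    };
    addIfNotEmpty(caseValue);
    addIfNotEmpty(numberValue);
    if (pos == Constants::POS_PRONOUN()) {
        addIfNotEmpty(genderValue);
    }
    return result;
}

} // namespace sketch
```

Searching for `POS_PRONOUN` then finds every implementation that depends on the grammeme, which is the maintenance benefit mentioned in the comment.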
addIfNotEmpty(formalityFeature);
addIfNotEmpty(clusivityFeature);
addIfNotEmpty(personFeature);
addIfNotEmpty(tenseFeature);
addIfNotEmpty(moodFeature);
This is very uncommon to allow to inflect. Typically it's only used during lemmaless inflection, but not as a user provided constraint. I recommend removing these lines to simplify the usage and maintenance. Inflecting for POS, gender, number, and case are the typical use cases in all of the language implementations.
for (const auto& suffix : suffixes) {
if (displayString.size() > suffix.size() && ends_with(displayString, suffix)) {
return new ::inflection::dialog::SpeakableString(GrammemeConstants::CASE_GENITIVE());
}
}
This looks better. Instead of checking the length twice, how about checking it once and using std::u16string::ends_with? It's used elsewhere in the code too. That's a part of newer versions of C++.
inflection::dialog::SpeakableString MlCommonConceptFactory::quantifyType(
const ::inflection::dialog::SpeakableString& formattedNumber,
const SemanticFeatureConceptBase& semanticConcept,
bool useDefault,
Plurality::Rule countType) const
{
::std::unique_ptr<::inflection::dialog::SpeakableString> speakableResult;
if (!useDefault) {
::std::unique_ptr<SemanticFeatureConceptBase> semanticConceptClone(npc(semanticConcept.clone()));

if (Plurality::Rule::ONE == countType) {
semanticConceptClone->putConstraint(*npc(semanticFeatureCount), GrammemeConstants::NUMBER_SINGULAR());
} else {
semanticConceptClone->putConstraint(*npc(semanticFeatureCount), GrammemeConstants::NUMBER_PLURAL());
}

speakableResult.reset(semanticConceptClone->toSpeakableString());
}

if (speakableResult == nullptr) {
speakableResult.reset(semanticConcept.toSpeakableString());
}

return quantifiedJoin(formattedNumber, *npc(speakableResult.get()), {}, countType);
}
This looks like the default implementation. I suspect that this method can be deleted, and this file should look similar to EnCommonConceptFactory.cpp.
എന്റെ,first,singular,genitive,determination=dependent,exclusive,personal
എന്റേത്,first,singular,genitive,determination=independent,exclusive,personal
The determination= is unnecessary. You can remove it.
grhoten
left a comment
Overall, the changes look about right. Some code cleanup is still needed to improve maintainability by others, and to handle edge cases.
Here are some additional tasks to work on before merging these changes.
- Please switch the dictionary_ml.lst and inflectional_ml.xml back to git lfs. It should be working again.
- Please switch the title of this pull request to a more informative summary, like "Inflection 85: Add support for Malayalam". You can also reference the issue number with "Fixes" so that the issue will be automatically closed when this is merged.
- Please update the summary to summarize what functionality is being implemented. The summary of the current state from your Google doc would be a good thing to put here.
- Please git squash these changes. The intermediate states aren't needed.
What hasn't been contributed are the RBNF rules for Malayalam. That is used by the quantify method of CommonConceptFactory. If you would like to contribute or work on that, that can be a separate submission to CLDR. I can go over those details, if you want to know more. This isn't a prerequisite for these changes, but it does help with the overall completeness of the implementation.
Thanks for doing this!
documents/how_to_add_new_language.md
Outdated
NOTE: Take a look at [PR #40](https://github.com/unicode-org/inflection/pull/40) and [PR #111](https://github.com/unicode-org/inflection/pull/111) for example on how to add initial language support based on dictionary lookup only.
In general, to bootstrap your progress look for grammatically similar language that's already supported, e.g. if you are adding Serbian look for existing Russian implementation.
This will help you find most of the files you need to add/change and will speed up implementation of the rules and lexicons.
We recommend you spend around a week researching the language and all the different components of the language before even beginning to modify and add the files below. Look at all the files in the project such as tokenizers, configuration files, grammar files, and different lookup functions to see what you need. This will save you a lot of time in the end. We highly suggest you stray away from hardcoded logic and rely on the Dictionary Lookup. Look at all the grammemes, tokenizer logic, multi-word phrase handling
It seems like this paragraph is incomplete. Please finish the thought.
<category name="animacy">
<grammeme name="animate"/>
<grammeme name="inanimate"/>
<grammeme name="human"/>
</category>
Please use the same indentation level as the other categories. The indentation is 4 spaces.
#
tokenizer.implementation.class=DefaultTokenizer
tokenizer.nonDecompound.file=/org/unicode/inflection/tokenizer/ml/nondecompound.tok
tokenizer.decompound=^(ശ്രീ)(.+?)(ഗുരു|സര്ക്കാര്)$|^(.+?)(ഗുരു|സര്ക്കാര്)$|^(.+?)(ഉണ്ട്|ആണ്|ഇല്ല)$|^(.+?)(ഒടൊപ്പം|ഉടൻ|ഓടെ|ഓട്|ഒപ്പം|തന്നെ|പോലും|പോലെ|ഉം|യ്)$|^(.+?)(കളുടെ|ങ്ങളുടെ|ത്തിന്റെ|ൻ്റെ|ന്റെ|യുടേ|യുടെ|യാൽ|യിൽ|ഇൽ|ല്|ൽ|ക്ക്|മാർ|ങ്ങൾ|കൾ|നെ|യെ)$
The ^ and $ should be removed. Those are automatically added by TokenExtractor. You'd be correct if those weren't automatically added.
namespace inflection::dialog::language {

MlCommonConceptFactory::MlCommonConceptFactory(const ::inflection::util::ULocale& language)
Is anything needed beyond the default constructor and destructor? This code looks a lot like the same behavior as EnCommonConceptFactory. If it's the default behavior, then please remove the redundant code.
bool ends_with(const std::u16string& str, const std::u16string& suffix) {
if (suffix.size() >= str.size()) return false;
return std::equal(suffix.rbegin(), suffix.rend(), str.rbegin());
}
Instead of writing this code, please consider using the ends_with on the u16string/u16string_view itself. Your guessFallbackNounInflection implementation has an example of using the string's ends_with.
formalityFeature(*npc(model.getFeature(u"formality"))),
clusivityFeature(*npc(model.getFeature(u"clusivity"))),
personFeature(*npc(model.getFeature(GrammemeConstants::PERSON))),
tenseFeature(*npc(model.getFeature(u"tense"))),
moodFeature(*npc(model.getFeature(u"mood"))),
pronounTypeFeature(*npc(model.getFeature(u"pronounType"))),
determinationFeature(*npc(model.getFeature(u"determination"))),
Please consider switching all of the hard coded strings to GrammemeConstants. I like reuse.
{GrammemeConstants::POS_NOUN(), GrammemeConstants::POS_VERB(), GrammemeConstants::POS_PRONOUN()},
{CASE_NOMINATIVE, CASE_ACCUSATIVE, CASE_DATIVE, CASE_GENITIVE, CASE_LOCATIVE, CASE_INSTRUMENTAL, CASE_SOCIATIVE},
{NUMBER_SINGULAR, NUMBER_PLURAL},
{GENDER_MASCULINE, GENDER_FEMININE, GENDER_NEUTER},
{FORMALITY_FORMAL, FORMALITY_INFORMAL},
{CLUSIVITY_INCLUSIVE, CLUSIVITY_EXCLUSIVE},
{PERSON_FIRST, PERSON_SECOND, PERSON_THIRD},
{TENSE_PAST, TENSE_PRESENT, TENSE_FUTURE},
{MOOD_INDICATIVE, MOOD_IMPERATIVE, MOOD_SUBJUNCTIVE}
Looks good. Do adjectives inflect? I thought you said no previously. If so, then nothing more needs to change for these arguments.
npc(npc(tokenizer.get())->createTokenChain(word)));

for (const auto& token : *tokenChain) {
if (dynamic_cast<const ::inflection::tokenizer::Token_Word*>(&token) != nullptr) {
Or you could check token.isSignificant(). It's a little clearer when using that. You probably copied this code from English or a similar implementation. It's probably better to use isSignificant instead.
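To illustrate the difference in readability, here is a small self-contained sketch that filters a token chain with a virtual `isSignificant()` call instead of a `dynamic_cast`. The `Token`, `WordToken`, and `TokenChain` types below are hypothetical stand-ins, not the library's tokenizer classes, and whether `isSignificant()` matches exactly the same tokens as the `Token_Word` cast in the real code is an assumption that should be checked.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

namespace sketch {

// Hypothetical base token: not significant unless a subclass says so.
class Token {
public:
    explicit Token(std::u16string value) : value_(std::move(value)) {}
    virtual ~Token() = default;
    virtual bool isSignificant() const { return false; }
    const std::u16string& getValue() const { return value_; }

private:
    std::u16string value_;
};

// Hypothetical word token: always significant.
class WordToken final : public Token {
public:
    using Token::Token;
    bool isSignificant() const override { return true; }
};

using TokenChain = std::vector<std::unique_ptr<Token>>;

// Keeps only the values of significant tokens; no dynamic_cast is needed.
std::vector<std::u16string> significantValues(const TokenChain& chain) {
    std::vector<std::u16string> values;
    for (const auto& token : chain) {
        if (token->isSignificant()) {
            values.push_back(token->getValue());
        }
    }
    return values;
}

// Small sample chain: word, delimiter, word.
TokenChain sampleChain() {
    TokenChain chain;
    chain.push_back(std::make_unique<WordToken>(u"a"));
    chain.push_back(std::make_unique<Token>(u" "));
    chain.push_back(std::make_unique<WordToken>(u"b"));
    return chain;
}

} // namespace sketch
```

The loop body states the intent directly, and new token subclasses can opt in without the caller changing.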
#include <inflection/tokenizer/TokenExtractor.hpp>
#include <map>

class inflection::tokenizer::locale::ml::MlTokenizer final
Please remove the changes to tokenizer/locale/ml. They seem to be unused and not needed.
<!-- Malayalam Adjective Inflection Tests -->
<test><source gender="feminine">പുതിയ</source><result>പുതിയ</result></test>
<test><source gender="masculine">പുതിയ</source><result>പുതിയ</result></test>
Are these results changing? Am I missing something? What's being tested here? If it's not supposed to change on purpose, please add a comment about that to this section of the tests.
Also please indent this file like the other xml files. It makes it easier to navigate between them.
# This is the 1st commit message: Add Malayalam dictionary support
# This is the commit message #2: Add Malayalam language support in LocaleUtils.hpp
# This is the commit message #3: Add Malayalam locale support to LocaleUtils
# This is the commit message #4: Add Malayalam language to the tests
# This is the commit message #5: Add Malayalam locale group ml_IN
# This is the commit message #6: ADD: Malayalam tokenizer configuration file
# This is the commit message #7: Inflection-85: Add Git LFS config for Malayalam dictionary and XML files
# This is the commit message #8: Add Malayalam inflection and pronoun tests
# This is the commit message #9: Updated copyright line
# This is the commit message #10: Updated copyright message
# This is the commit message #11: Updated copright message
# This is the commit message #12: Updated language grammar to include Malayalam
# This is the commit message #13: Added pronouns for Malayalam
# This is the commit message #14: Add ll GrammarSynthesizer files
# This is the commit message #15: Add Malayalam grammar synthesizer
# This is the commit message #16: Add Malayalam-specific CommonConceptFactory with lists and quantities
# This is the commit message #17: Update document on how to add a new language, fixed errors
# This is the commit message #18: Updated grammar.xml for Malayalam
# This is the commit message #19: Update pronoun_ml.csv
# This is the commit message #20: Updated all grammar synthesizer component for Malayalam
# This is the commit message #21: Update Common Concept Factory files
# This is the commit message #22: Updated tests for Malayalam
# This is the commit message #23: Fix Malayalam grammar synthesis and remove count lookup function
# This is the commit message #24: Updated Grammeme Constants files to include sociative case
# This is the commit message #25: Temporary fix for GitHub
Inflection 85: Add language support and grammar synthesis for Malayalam
Add Malayalam dictionary support
Add Malayalam language support in LocaleUtils.hpp
Add Malayalam locale support to LocaleUtils
Add Malayalam language to the tests
Add Malayalam locale group ml_IN
ADD: Malayalam tokenizer configuration file
Inflection-85: Add Git LFS config for Malayalam dictionary and XML files
Add Malayalam inflection and pronoun tests
Updated copyright line
Updated copyright message
Updated copright message
Updated language grammar to include Malayalam
Added pronouns for Malayalam
Add ll GrammarSynthesizer files
Add Malayalam grammar synthesizer
Add Malayalam-specific CommonConceptFactory with lists and quantities
Update document on how to add a new language, fixed errors
Updated grammar.xml for Malayalam
Update pronoun_ml.csv
Updated all grammar synthesizer component for Malayalam
Update Common Concept Factory files
Updated tests for Malayalam
Fix Malayalam grammar synthesis and remove count lookup function
Updated Grammeme Constants files to include sociative case
Temporary fix for GitHub
Modified files to fix more test errors
Modified files to fix more test errors
Update files to fix errors
Same file as before but with corrected indentations
Update test files to reflect tokenization
Added tokenizer files
Added feedback on how to add a new language
Modified files
New changes to fix errors
Updated Grammar Synthesizer file
Fixed errors
Made changes to fix errors
Fix Common Concept Factory
Reduce memory leaks
* Update versions.mk * Update cmake-multi-platform.yml * Update cmake-multi-platform.yml * Update cmake-multi-platform.yml
* Add Factory for MF2 Also add unit test and data driven unit test Test data based on inflection/test/resources/inflection/dialog/inflection/*.xml Test formatter while the <result> is not empty Test Selector while the <result> is empty, and there are one attribute which is not "exists". Uppercase the factory function Change copyright, remove iostream, add comments Fix build issue if the icu is < 77 Remove Data Driven test Fix Locale building Remove version Check * Reformat
* Fix memory leak in MF2Factory.cpp (issue/131) Need to delete the return of toSpeakableString() and getFeatureValue() return from InflectableStringConcept. Fix ICU cache * Use std::unique_ptr instead of raw pointer and delete * Fix per george feedback
* Inflection 134 std::unique_ptr for TokenExtractor * remove () * remove ()
* Update cmake-multi-platform.yml * Update ubuntu-memory-check.yml * Update ubuntu-memory-check.yml * Update cmake-multi-platform.yml
* Inflection 134 std::unique_ptr for DictionaryMetaData.cpp * fix
* Inflection 134 std::unique_ptr for DictionaryKeyIterator.cpp * add npc
#141) * Inflection 134 std::unique_ptr for KoGrammarSynthesizer_ParticleResolver.cpp * add npc
* Cache LFS content (fix issue/127) and fix icu cache by adding save action * only perform lfs action if missed the cache * Add ls to debug the content restored * Debug * Correct path * fix * Remove dupicate lines
* Try to add doxygen * Update doxygen * Update file * Fix Doxygen file * Add dos gen * Skip mac for doc gen * add doxgey * Use DenverCoder1/doxygen-github-pages-action * Tweak the configue path * Try to fix dependency * Add different yml for deploy * Remove extra lines * Add push * add debug * Change path * Fix * Change way to install doxygen * Fix * Add run to debug * push * Remove double $ * Remove
* Change namespace in message2 * Remove namespace in hpp file
Here is why we fix the doxygen in this way.
I asked Gemini how to fix this issue with the following question:
Why can't Doxygen recognize a class like class INFLECTION_CLASS_API inflection::dialog::NumberConcept? How can it be fixed?
Gemini answered:
Doxygen's primary job is to parse C++, C, Java, Python, and other code to generate documentation. When Doxygen fails to recognize a class like class INFLECTION_CLASS_API inflection::dialog::NumberConcept, it's almost always due to the preprocessor macro INFLECTION_CLASS_API.
Here's why this happens and how to fix it:
Why Doxygen Cannot Recognize It
Preprocessor Macro Obscuring the Class Definition: Doxygen has a built-in preprocessor. If INFLECTION_CLASS_API is a macro that expands to something like __declspec(dllexport) (common for Windows DLLs) or __attribute__((visibility("default"))) (common for GCC/Clang on Linux/macOS), Doxygen's parser, by default, might not correctly interpret the resulting code. It sees __declspec(dllexport) class ... instead of class ....
ENABLE_PREPROCESSING = NO: If ENABLE_PREPROCESSING is set to NO in your Doxyfile, Doxygen won't process any macros, and it will simply see INFLECTION_CLASS_API as part of the class name, leading to a parsing error.
Macro Expansion Issues: Even with preprocessing enabled, Doxygen's default macro expansion behavior might not be sufficient. It might not expand INFLECTION_CLASS_API in a way that reveals the underlying class keyword to its parser.
How to Fix It
The solution involves configuring Doxygen's preprocessor to correctly handle the INFLECTION_CLASS_API macro.
Enable Preprocessing:
Make sure ENABLE_PREPROCESSING is set to YES in your Doxyfile.
Code snippet
ENABLE_PREPROCESSING = YES
Define the Macro in Doxygen:
You need to tell Doxygen what INFLECTION_CLASS_API means so it can effectively ignore or replace it during parsing. This is done using the PREDEFINED tag in your Doxyfile.
If INFLECTION_CLASS_API is empty (or should be ignored by Doxygen):
This is often the case for API macros where they expand to nothing for documentation builds.
Code snippet
PREDEFINED = INFLECTION_CLASS_API=
This tells Doxygen to replace INFLECTION_CLASS_API with nothing.
If INFLECTION_CLASS_API expands to a __declspec or __attribute__:
You need to remove these compiler-specific attributes.
For __declspec(dllexport) or __declspec(dllimport):
Code snippet
PREDEFINED = __declspec(x)=
This tells Doxygen to replace __declspec(anything) with nothing.
For __attribute__((visibility("default"))) or similar:
Code snippet
PREDEFINED = __attribute__(x)=
This tells Doxygen to replace __attribute__(anything) with nothing.
If INFLECTION_CLASS_API is a more complex macro:
If it's something like #define INFLECTION_CLASS_API SOME_OTHER_MACRO class, you might need to combine PREDEFINED with MACRO_EXPANSION and EXPAND_ONLY_PREDEF.
For example, if INFLECTION_CLASS_API itself expands to a simple empty string or a known attribute, the above methods should work. If it's more complex, you might need to use MACRO_EXPANSION = YES and EXPAND_ONLY_PREDEF = YES, then explicitly define INFLECTION_CLASS_API in PREDEFINED to be empty or a simple class keyword if it's truly hiding class. However, it's generally better to make the macro itself expand to an empty string for Doxygen.
Example PREDEFINED entry (most common scenario):
Code snippet
PREDEFINED = INFLECTION_CLASS_API= \
__declspec(x)= \
__attribute__(x)=
(Note the \ for multi-line definitions).
Run Doxygen in Debug Mode (Preprocessor Output):
If the above doesn't work, you can see exactly what Doxygen's preprocessor is doing by running Doxygen with the -d Preprocessor flag:
Bash
doxygen -d Preprocessor your_Doxyfile
This will print the preprocessed output to the console, allowing you to see how INFLECTION_CLASS_API is being handled and debug your PREDEFINED settings.
Steps to Implement the Fix:
Open your Doxyfile: This is the configuration file for your Doxygen project.
Locate ENABLE_PREPROCESSING: Ensure it's set to YES. If not, change it.
Locate PREDEFINED: Add or modify this tag.
Find out what INFLECTION_CLASS_API expands to in your actual build system (e.g., by looking at compiler preprocessor output or just checking the macro definition in your code).
Based on that, add the appropriate definition to PREDEFINED to effectively remove or normalize INFLECTION_CLASS_API for Doxygen.
Re-run Doxygen: After modifying the Doxyfile, run Doxygen again to generate the documentation.
By correctly configuring Doxygen's preprocessor to handle INFLECTION_CLASS_API, you should be able to get it to recognize your inflection::dialog::NumberConcept class.
* Add comment for super class documentation Fix 147 * Fix by using @cond/@endcond * change // to ///
add documentation for operator ==, != and ()
correct the name of the parameter o -> displayValue
…7) (#158) * fix doxygen warning in SemanticFeatureModel_DisplayData.hpp Add document for operator ==, != and () * Update SemanticFeatureModel_DisplayData.hpp
* Update InflectableStringConcept.hpp * Update SemanticConceptList.hpp * Update SemanticFeature.hpp * Update SemanticFeature.hpp * Update SemanticFeatureConceptBase.hpp * Update SemanticFeatureModel.hpp * Update SemanticValue.hpp * Update SpeakableString.hpp * Update XMLParseException.hpp * Update LanguageGrammarFeatures_Feature.hpp * Update LanguageGrammarFeatures_GrammarCategory.hpp * Update LanguageGrammarFeatures_GrammarFeatures.hpp * Update LanguageGrammarFeatures_GrammarFeatures.hpp * Update LanguageGrammarFeatures_GrammarFeatures.hpp * Update inflection/src/inflection/dialog/InflectableStringConcept.hpp Co-authored-by: George Rhoten <[email protected]> * Update inflection/src/inflection/dialog/SemanticFeature.hpp Co-authored-by: George Rhoten <[email protected]> * Update inflection/src/inflection/dialog/SemanticFeature.hpp Co-authored-by: George Rhoten <[email protected]> * Update inflection/src/inflection/dialog/SemanticFeatureConceptBase.hpp Co-authored-by: George Rhoten <[email protected]> * Update inflection/src/inflection/dialog/SemanticFeatureConceptBase.hpp Co-authored-by: George Rhoten <[email protected]> * Update inflection/src/inflection/dialog/SemanticFeatureConceptBase.hpp Co-authored-by: George Rhoten <[email protected]> * Update inflection/src/inflection/dialog/SemanticFeatureConceptBase.hpp Co-authored-by: George Rhoten <[email protected]> * Update inflection/src/inflection/dialog/SemanticFeatureConceptBase.hpp Co-authored-by: George Rhoten <[email protected]> * Update inflection/src/inflection/dialog/SemanticFeatureModel.hpp Co-authored-by: George Rhoten <[email protected]> * Update inflection/src/inflection/dialog/SemanticFeatureModel.hpp Co-authored-by: George Rhoten <[email protected]> * Update inflection/src/inflection/dialog/SemanticFeatureModel.hpp Co-authored-by: George Rhoten <[email protected]> * Update inflection/src/inflection/dialog/SemanticFeatureModel.hpp Co-authored-by: George Rhoten <[email protected]> * Update 
inflection/src/inflection/dialog/SemanticFeatureModel.hpp Co-authored-by: George Rhoten <[email protected]> * Update inflection/src/inflection/dialog/SemanticFeatureModel.hpp Co-authored-by: George Rhoten <[email protected]> * Update inflection/src/inflection/dialog/SemanticValue.hpp Co-authored-by: George Rhoten <[email protected]> * Update inflection/src/inflection/dialog/SpeakableString.hpp Co-authored-by: George Rhoten <[email protected]> * Update inflection/src/inflection/exception/XMLParseException.hpp Co-authored-by: George Rhoten <[email protected]> * Update inflection/src/inflection/lang/features/LanguageGrammarFeatures_Feature.hpp Co-authored-by: George Rhoten <[email protected]> * Update inflection/src/inflection/dialog/SemanticFeatureConceptBase.hpp Co-authored-by: George Rhoten <[email protected]> * Update SemanticFeature.hpp formatting * Update inflection/src/inflection/dialog/SemanticFeatureConceptBase.hpp Co-authored-by: George Rhoten <[email protected]> * Update SemanticFeatureConceptBase.hpp formatting * Update SemanticFeatureModel.hpp * Update LanguageGrammarFeatures_Feature.hpp * Update SemanticFeature.hpp * Update LanguageGrammarFeatures_GrammarCategory.hpp * Update LanguageGrammarFeatures_GrammarFeatures.hpp * Update SemanticFeatureConceptBase.hpp * Update SemanticFeature.hpp * Update LanguageGrammarFeatures_GrammarFeatures.hpp * Update LanguageGrammarFeatures_GrammarCategory.hpp * Update LanguageGrammarFeatures_GrammarCategory.hpp --------- Co-authored-by: George Rhoten <[email protected]>
* Github workflow for Ubuntu Packaging * Github workflow for Ubuntu Packaging. * Github workflow for Ubuntu Packaging! * Refactor packaging workflow based on the feedback * Fix CPack versioning: derive major/minor/patch from INFLECTION_VERSION * Fixing Version Issue * Fixing the Cmake Version * Fix: Cpack Version Handling * Add CPack-based Ubuntu packaging and GitHub Actions release workflow
) * Github workflow for Ubuntu Packaging * Github workflow for Ubuntu Packaging. * Github workflow for Ubuntu Packaging! * Refactor packaging workflow based on the feedback * Fix CPack versioning: derive major/minor/patch from INFLECTION_VERSION * Fixing Version Issue * Fixing the Cmake Version * Fix: Cpack Version Handling * Add CPack-based Ubuntu packaging and GitHub Actions release workflow * Fix GitHub release workflow, fix cache key and release upload * Update workflow trigger to release on v* tags * Fix release workflow: use stable ICU cache key * Fix release workflow - Ubuntu * Fix release workflow - Ubuntu Packaging * Fix Ubuntu packaging workflow: ensure ICU env vars are exported for all build/test steps
…(issue 155) (#160) * Remove operator< and compareTo and add operator<=> * Update ULocale.hpp * Fix hash test. Ensure the comparator sort correctly.
* Use std::unique_ptr to avoid delete * Remove change to warm up code * Make destructor public instead of adding friend
* Turn on fail on doxygen warning * Turn on runnong doxygen on PR * Fix images error * Update doxygen-gh-pages.yml
…n-62 Integrate ar Wikidata into Unicode Inflection Inflection-61 Integrate he Wikidata into Unicode Inflection Inflection-60 Integrate hi Wikidata into Unicode Inflection Inflection-58 Integrate nb Wikidata into Unicode Inflection Inflection-56 Integrate nl Wikidata into Unicode Inflection Inflection-55 Integrate tr Wikidata into Unicode Inflection Inflection-54 Integrate ru Wikidata into Unicode Inflection Inflection-53 Integrate it Wikidata into Unicode Inflection Inflection-52 Integrate pt Wikidata into Unicode Inflection Inflection-51 Integrate fr Wikidata into Unicode Inflection Inflection-50 Integrate de Wikidata into Unicode Inflection (#167) Inflection-63 Integrate ko Wikidata into Unicode Inflection Inflection-62 Integrate ar Wikidata into Unicode Inflection Inflection-61 Integrate he Wikidata into Unicode Inflection Inflection-60 Integrate hi Wikidata into Unicode Inflection Inflection-58 Integrate nb Wikidata into Unicode Inflection Inflection-56 Integrate nl Wikidata into Unicode Inflection Inflection-55 Integrate tr Wikidata into Unicode Inflection Inflection-54 Integrate ru Wikidata into Unicode Inflection Inflection-53 Integrate it Wikidata into Unicode Inflection Inflection-52 Integrate pt Wikidata into Unicode Inflection Inflection-51 Integrate fr Wikidata into Unicode Inflection Inflection-50 Integrate de Wikidata into Unicode Inflection
* Update doxygen-gh-pages.yml * Update cmake-multi-platform.yml * Update ubuntu-memory-check.yml * Update cmake-multi-platform.yml * Update doxygen-gh-pages.yml * Update ubuntu-memory-check.yml * Update create-ubuntu-distribution-packaging.yml * Update create-ubuntu-distribution-packaging.yml fix comments * Update cmake-multi-platform.yml * Update doxygen-gh-pages.yml
d11e89e to bf1d1d6
757ada7 to 19e2e50
The remainder of the review will be moved to the new pull request 175.
This has been resubmitted as #176
The project expands Unicode Inflection to support Malayalam, allowing for grammatical inflections for nouns, pronouns, verbs, adjectives, and any other relevant words. Using data from Wikidata and custom inflection rules implemented in C++ and C, the project ensures Malayalam is correctly formatted across number, gender, case, tense, mood, and other aspects. This enhances language tools, making communication in Malayalam more accurate and accessible.
Includes Fixes #85